Assignment as part of Shiny-based Visual Analytics Application project
The Global Innovation Index (GII) 2020 analyzes key global innovation trends and presents a ranking of the innovation performance of 131 economies around the world.
With the Covid-19 pandemic, the question of “Who Will Finance Innovation?”, the theme of this year’s GII, is critical in solving the seemingly insuperable challenges ahead of us.
Framework of the GII 2020:
In this project, we aim to analyse and identify patterns in the GII during the Covid-19 pandemic. We intend to draw conclusions from the data and visualize it for individual countries and regions, with a particular focus on Singapore.
The scope of our project is an R Shiny application for interactive (I) Exploratory Data Analysis, which includes a Choropleth Map, Bubble Plot, Radar Chart and Time-Series Analysis, and (II) Statistical Analysis, which includes a Box Plot, Correlation Analysis, Hierarchical Clustering, Scatter Plot and Violin Plot.
I will be covering the portion on Exploratory Data Analysis and my teammate will cover the portion on Statistical Analysis.
Choropleth Map: A choropleth map is a clear way to identify innovation score patterns by geographic location. Colour makes it easy to distinguish the most innovative countries from the least innovative ones: the darker the colour, the higher the innovation score. Countries in grey do not have innovation data.
Radar Chart: Allows comparison of the seven variables (Institutions, Human Capital and Research, Infrastructure, Market Sophistication, Business Sophistication, Knowledge and Technology Outputs, and Creative Outputs) between Singapore and Switzerland (ranked first in the GII).
Time Series using Line Graph: A time-series visualization shows how the innovation scores of countries trend from 2015 to 2020. A line graph presents such data clearly.
Bubble Plot: Shows the relationship between innovation inputs (Institutions, Human Capital and Research, Infrastructure, Market Sophistication, Business Sophistication) and innovation outputs (Knowledge and Technology Outputs, and Creative Outputs).
Import the relevant packages into R. These packages are used to read the data and plot the visualizations.
packages <- c('tidyverse', 'sf', 'tmap', 'countrycode', 'plotly', 'reshape2', 'leaflet')
for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
The raw dataset is obtained from https://www.globalinnovationindex.org/analysis-indicator and was cleaned for visualisation and analysis by my teammate, Lance Teo. Data cleaning steps can be found in his article post on GII statistical analysis.
The CSV file of the dataset is loaded into the variable gii_df_wide_score using the read_csv() function.
gii_df_wide_score<-read_csv("data/gii_dataset_2015-2020_wide_score.csv")
The pre-processing steps below convert the raw data into structures that are suitable for mapping and analysis.
Shapefiles encode points, lines and polygons in geographic space and carry a .shp extension. The shapefile used here can be obtained from Thematic Mapping.
shp <- st_read(dsn = "data/shape_files",
               layer = "TM_WORLD_BORDERS-0.3")
Reading layer `TM_WORLD_BORDERS-0.3' from data source `C:\ElaineVisualAnalytics\blog\_posts\2021-04-10-gii-exploratory-and-analysis\data\shape_files' using driver `ESRI Shapefile'
Simple feature collection with 246 features and 11 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -180 ymin: -90 xmax: 180 ymax: 83.6236
geographic CRS: WGS 84
To know which region each country belongs to, we map each country to its region using countrycode (an R package). The countrycode() function converts to and from several different country coding schemes.
gii_df_wide_score <- gii_df_wide_score %>%
  mutate(iso3 = countrycode(Country, origin = 'country.name.en', destination = 'iso3c')) %>%
  mutate(region = countrycode(Country, origin = 'country.name.en', destination = 'un.region.name'))
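As a quick illustration of what the two countrycode() calls return, here is a minimal sketch on a toy vector. The three country names are examples picked for illustration and are not taken from the dataset's actual contents.

```r
library(countrycode)

# Toy input: example country names chosen for illustration only
countries <- c("Singapore", "Switzerland", "Brazil")

# Same origin/destination schemes as in the pipeline above
iso3   <- countrycode(countries, origin = "country.name.en", destination = "iso3c")
region <- countrycode(countries, origin = "country.name.en", destination = "un.region.name")

data.frame(country = countries, iso3 = iso3, region = region)

# Names that cannot be matched come back as NA (with a warning),
# so it is worth listing them before joining on the code
countries[is.na(iso3)]
```

Checking for NA codes at this point helps explain any country that later shows up without data on the map.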
The shapefile is joined to the dataset with left_join() on the International Organization for Standardization (“ISO”) country code; ISO3 is the unique three-character code that identifies each country. The dataset is filtered to the year 2020 before the join, and the joined object is the map_choropleth used in the code that follows.
A first choropleth is built by creating a tmap object with tm_shape() and adding a thematic layer with tm_polygons(). However, this choropleth map is static and does not allow users to drill down for more information.
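The join itself is not shown in this post, so the following is only a sketch of how it might look, run on toy data. The `Year` column name, the `ISO3` field name in the shapefile, the toy objects (`shp_toy`, `gii_toy`, `map_toy`) and the score values are all assumptions or made-up placeholders, not the project's actual code or data.

```r
library(dplyr)
library(sf)

# Toy stand-in for the shapefile: two point geometries keyed by ISO3
# (the real layer holds country polygons; "ISO3" as field name is an assumption)
shp_toy <- st_sf(
  ISO3 = c("SGP", "CHE"),
  geometry = st_sfc(st_point(c(103.8, 1.35)), st_point(c(8.2, 46.8)), crs = 4326)
)

# Toy stand-in for the GII table; `Year` is an assumed column name
# and the scores are placeholders, not real GII values
gii_toy <- tibble(
  Country = c("Singapore", "Singapore"),
  iso3 = c("SGP", "SGP"),
  Year = c(2019, 2020),
  `GLOBAL INNOVATION INDEX` = c(50, 60)
)

# Keep every shape and attach the 2020 scores where the ISO3 codes match
map_toy <- left_join(
  shp_toy,
  filter(gii_toy, Year == 2020),
  by = c("ISO3" = "iso3")
)

map_toy
```

Because the shapefile is on the left side of the join, shapes with no matching score are kept with NA values, which is consistent with countries lacking innovation data being drawn in grey.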
tm_shape(map_choropleth) +
  tm_polygons("GLOBAL INNOVATION INDEX")

The map above visualizes the innovation scores on a world map using tmap. To produce an interactive map, leaflet (an R package) is used for the final visualization. The customized leaflet choropleth map adds a legend with addLegend() and a tooltip, configured with labelOptions, that displays the country name and GII score when the user hovers over a country.
# Score bins and the colour palette that maps each bin to a shade of blue
bins <- c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100)
pal <- colorBin("Blues", domain = map_choropleth$`GLOBAL INNOVATION INDEX`, bins = bins)
# HTML tooltip: country name in bold, GII score on the next line
labels <- sprintf(
  "<strong>%s</strong><br/>%g",
  map_choropleth$Country, map_choropleth$`GLOBAL INNOVATION INDEX`
) %>% lapply(htmltools::HTML)
leaflet(map_choropleth) %>%
  addProviderTiles("CartoDB.Positron") %>%
  addPolygons(
    fillColor = ~pal(`GLOBAL INNOVATION INDEX`),
    weight = 1,
    opacity = 0.7,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7,
    # Emphasise the country under the cursor
    highlightOptions = highlightOptions(
      weight = 2,
      color = "#666",
      dashArray = "",
      fillOpacity = 0.7,
      bringToFront = TRUE),
    label = labels,
    labelOptions = labelOptions(
      style = list("font-weight" = "normal", padding = "3px 8px"),
      textsize = "15px",
      direction = "auto")) %>%
  addLegend(pal = pal, values = ~`GLOBAL INNOVATION INDEX`, opacity = 0.7,
            title = 'GII Score', position = "bottomright")
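To see how the palette turns scores into colours, the palette function can be called on its own. This is a small sketch that assumes a fixed 0 to 100 domain instead of the map data; the three test scores are arbitrary.

```r
library(leaflet)

# Same bins as in the map; the fixed domain c(0, 100) is an assumption
# made so the sketch runs without the map data
bins <- seq(0, 100, by = 10)
pal <- colorBin("Blues", domain = c(0, 100), bins = bins)

# Each score falls into one bin and gets that bin's colour:
# low scores are light blue, high scores are dark blue
pal(c(5, 55, 95))

# Missing scores get the default na.color, a mid grey
pal(NA)  # "#808080"
```

This is why countries without innovation data appear grey on the map: their score is NA after the join, and colorBin() falls back to its na.color.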